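In the analysis of observational data in the social sciences and in business, it is difficult to obtain a "(quasi) single-source dataset" in which all variables of interest are observed simultaneously. Instead, multiple-source datasets are typically acquired for different individuals or units. Various methods have been proposed to study the relationships between the variables in each dataset, such as matching and latent variable modeling. These approaches require treating the datasets as a single-source dataset with missing variables. Existing methods assume that the datasets to be integrated are drawn from the same population, or that sampling depends only on covariates. In terms of missingness, this assumption is known as missing at random (MAR). However, as applied studies have shown, this assumption may not hold in real data analysis, and the results obtained may be biased. We propose a data fusion method that does not assume the datasets are homogeneous. We use a Gaussian process latent variable model for non-MAR missing data. The model assumes that both the variables of interest and the probability of missingness depend on latent variables. A simulation study and a real data analysis show that the proposed method, which combines a missing-data mechanism with a latent Gaussian process, yields valid estimates, whereas existing methods give severely biased estimates. This is the first study in which non-random assignment to datasets is considered and resolved under reasonable assumptions in the data fusion problem.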
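The bias that arises when missingness depends on a latent variable can be illustrated with a small simulation. This is only a sketch under assumed settings, not the proposed model: the function name `simulate_observed_mean`, the logistic observation probability, and all coefficients are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_observed_mean(n=100_000, seed=0):
    """Compare the full-sample mean with the mean of the observed subset
    when the chance of being observed depends on a latent variable."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                  # latent variable (assumed standard normal)
    y = z + 0.5 * rng.normal(size=n)        # variable of interest depends on z
    p_obs = 1.0 / (1.0 + np.exp(-2.0 * z))  # observation probability also depends on z
    observed = rng.random(n) < p_obs        # non-MAR: missingness driven by the latent z
    return float(y.mean()), float(y[observed].mean())

full_mean, observed_mean = simulate_observed_mean()
print(f"full-sample mean: {full_mean:.3f}")
print(f"observed-only mean: {observed_mean:.3f}")
```

In this toy setting the observed-only mean is shifted upward relative to the full-sample mean, because units with large latent values are both more likely to be observed and more likely to have large outcomes.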
translated by 谷歌翻译
The development of deep neural networks has improved representation learning in various domains, including textual, graph-structural, and relational triple representations. This development has opened the door to new forms of relation extraction beyond traditional text-oriented relation extraction. However, the effectiveness of considering information from multiple heterogeneous domains simultaneously is still underexplored, and a model that can take advantage of integrating heterogeneous information is expected to contribute significantly to many real-world problems. This thesis takes the extraction of Drug-Drug Interactions (DDIs) from the literature as a case study and realizes relation extraction that utilizes heterogeneous domain information. First, a deep neural relation extraction model is prepared and its attention mechanism is analyzed. Next, a method that combines drug molecular structure information and drug description information with the input sentence information is proposed, and the effectiveness of utilizing drug molecular structures and drug descriptions for the relation extraction task is shown. Then, to further exploit the heterogeneous information, drug-related items such as protein entries, medical terms, and pathways are collected from multiple existing databases, and a new data set in the form of a knowledge graph (KG) is constructed. A link prediction task is conducted on the constructed data set to obtain embedding representations of drugs that contain the heterogeneous domain information. Finally, a method that integrates the input sentence information and the heterogeneous KG information is proposed. The proposed model is trained and evaluated on a widely used data set, and the results show that utilizing heterogeneous domain information significantly improves the performance of relation extraction from the literature.
To simulate bosons on a qubit- or qudit-based quantum computer, one has to regularize the theory by truncating infinite-dimensional local Hilbert spaces to finite dimensions. In the search for practical quantum applications, it is important to know how big the truncation errors can be. In general, it is not easy to estimate errors unless we have a good quantum computer. In this paper we show that traditional sampling methods on classical devices, specifically Markov Chain Monte Carlo, can address this issue with a reasonable amount of computational resources available today. As a demonstration, we apply this idea to the scalar field theory on a two-dimensional lattice, with a size that goes beyond what is achievable using exact diagonalization methods. This method can be used to estimate the resources needed for realistic quantum simulations of bosonic theories, and also, to check the validity of the results of the corresponding quantum simulations.
Hyperparameter optimization (HPO) is essential for achieving good performance in deep learning, and practitioners often need to consider the trade-off between multiple metrics, such as error rate, latency, memory requirements, robustness, and algorithmic fairness. Due to this demand and the heavy computational cost of deep learning, accelerating multi-objective (MO) optimization is becoming ever more important. Although meta-learning has been studied extensively as a way to speed up HPO, existing methods are not applicable to the MO tree-structured Parzen estimator (MO-TPE), a simple yet powerful MO-HPO algorithm. In this paper, we extend TPE's acquisition function to the meta-learning setting, using a task similarity defined by the overlap in the promising domains of each task. In a comprehensive set of experiments, we demonstrate that our method accelerates MO-TPE on tabular HPO benchmarks and yields state-of-the-art performance. Our method was also validated externally by winning the AutoML 2022 competition on "Multiobjective Hyperparameter Optimization for Transformers".
Deformable registration of two-dimensional/three-dimensional (2D/3D) images of abdominal organs is a complicated task because the abdominal organs deform significantly and their contours are not detected in two-dimensional X-ray images. We propose a supervised deep learning framework that achieves 2D/3D deformable image registration between 3D volumes and single-viewpoint 2D projected images. The proposed method learns the translation from the target 2D projection images and the initial 3D volume to 3D displacement fields. In experiments, we registered 3D computed tomography (CT) volumes to digitally reconstructed radiographs generated from abdominal 4D-CT volumes. For validation, we used 4D-CT volumes of 35 cases and confirmed that the 3D-CT volumes reflecting the nonlinear and local respiratory organ displacement were reconstructed. The proposed method demonstrates performance comparable to that of conventional methods, with a Dice similarity coefficient of 91.6% for the liver region and 85.9% for the stomach region, while estimating significantly more accurate CT values.
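For reference, the Dice similarity coefficient reported above measures the overlap between two binary masks as twice the intersection divided by the sum of the mask sizes. The following is a minimal sketch of that standard definition; the function name `dice_coefficient` and the toy masks are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / denom)

# Toy 2D masks: each has 3 foreground pixels, 2 of which overlap.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])
print(dice_coefficient(pred, target))  # 2 * 2 / (3 + 3) = 0.666...
```

The same function applies unchanged to 3D organ masks, since it only relies on element-wise operations and sums.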
Mobile stereo-matching systems have become an important part of many applications, such as automated-driving vehicles and autonomous robots. Accurate stereo-matching methods usually have high computational complexity, whereas mobile platforms have only limited hardware resources in order to keep their power consumption low. This makes it difficult to maintain both an acceptable processing speed and accuracy on mobile platforms. To resolve this trade-off, we propose a novel acceleration approach for the well-known zero-mean normalized cross-correlation (ZNCC) matching cost calculation algorithm on a Jetson Tx2 embedded GPU. In our method for accelerating ZNCC, target images are scanned in a zigzag fashion to efficiently reuse one pixel's computation for its neighboring pixels; this reduces the amount of data transmission and increases the utilization of on-chip registers, thus increasing the processing speed. As a result, our method is 2X faster than the traditional image scanning method and 26% faster than the latest NCC method. By combining this technique with the domain transformation (DT) algorithm, our system achieves a real-time processing speed of 32 fps on a Jetson Tx2 GPU for 1,280x384 pixel images with a maximum disparity of 128. Additionally, the evaluation results on the KITTI 2015 benchmark show that our combined system is more accurate than the same algorithm combined with census by 7.26%, while maintaining almost the same processing speed.
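As a point of reference, the ZNCC matching cost between two image patches can be written in a few lines. This is a plain CPU sketch of the standard definition only; it does not implement the zigzag scanning or the GPU register reuse described above, and the function name `zncc` and the `eps` guard are assumptions made for illustration.

```python
import numpy as np

def zncc(patch_a, patch_b, eps=1e-12):
    """Zero-mean normalized cross-correlation of two equally sized patches.

    Returns a value in [-1, 1]; 1 means the patches match up to an
    affine change in brightness (gain and offset).
    """
    a = np.asarray(patch_a, dtype=np.float64)
    b = np.asarray(patch_b, dtype=np.float64)
    a0 = a - a.mean()                       # remove the mean intensity of each patch
    b0 = b - b.mean()
    denom = np.sqrt((a0 * a0).sum() * (b0 * b0).sum())
    if denom < eps:
        # At least one patch is flat, so the correlation is undefined; return 0 by convention.
        return 0.0
    return float((a0 * b0).sum() / denom)

patch = np.array([[10.0, 20.0, 30.0],
                  [40.0, 50.0, 60.0],
                  [70.0, 80.0, 95.0]])
print(zncc(patch, 2.0 * patch + 10.0))  # close to 1: invariant to gain and offset
print(zncc(patch, -patch))              # close to -1: inverted contrast
```

In a stereo matcher, this cost would be evaluated between a window in the left image and candidate windows in the right image over the disparity range, and the disparity with the highest score would be selected.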
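Relational datasets are often analyzed by assigning a label to each element through clustering or ordering. Although clustering and ordering methods can achieve similar characterizations of a dataset, the former has been studied much more actively than the latter, particularly for data represented as graphs. This study fills this gap by investigating the methodological relationships between several clustering and ordering methods, with a focus on spectral techniques. In addition, we evaluate the resulting performance of the clustering and ordering methods. To this end, we propose a measure called the label continuity error, which generically quantifies the degree of consistency between a sequence and a partition of a set of elements. Based on synthetic and real-world datasets, we evaluate the extent to which ordering methods identify module structures and clustering methods identify banded structures.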
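We present UnrealEgo, a new large-scale naturalistic dataset for egocentric 3D human pose estimation. UnrealEgo is based on an advanced concept of eyeglasses equipped with two fisheye cameras that can be used in unconstrained environments. We design a virtual prototype of the eyeglasses and attach it to 3D human models for stereo view capture. We then generate a large corpus of human motions. As a result, UnrealEgo is the first dataset to provide in-the-wild stereo images with the largest variety of motions among existing egocentric datasets. Furthermore, we propose a new benchmark method built on a simple but effective idea: designing a 2D keypoint estimation module for stereo inputs to improve 3D human pose estimation. Extensive experiments show that our method outperforms previous state-of-the-art methods both qualitatively and quantitatively. UnrealEgo and our source code are available on our project web page.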
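Using reinforcement learning (RL) in cyber-physical systems is challenging because of the lack of safety guarantees during learning. Although various proposals exist for reducing undesired behaviors during learning, most of these techniques require prior system knowledge, and their applicability is limited. This paper aims to reduce undesired behaviors during learning without requiring any prior system knowledge. We propose dynamic shielding, an extension of a model-based safe RL technique called shielding that uses automata learning. The dynamic shielding technique constructs an approximate system model in parallel with RL using a variant of the RPNI algorithm, and it suppresses undesired exploration by means of the shield constructed from the learned model. Through this combination, potentially unsafe actions can be foreseen before the agent experiences them. Experiments show that our dynamic shield significantly reduces the number of undesired events during training.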
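We propose a method for estimating bone mineral density (BMD) from a plain X-ray image. Dual-energy X-ray absorptiometry (DXA) and quantitative computed tomography (QCT) provide high accuracy in diagnosing osteoporosis; however, these modalities require special equipment and scan protocols. Measuring BMD from an X-ray image enables opportunistic screening, which is potentially useful for early diagnosis. Previous methods that directly learn the relationship between X-ray images and BMD require a large training dataset to achieve high accuracy because of the large intensity variations in X-ray images. We therefore propose an approach that uses QCT to train a generative adversarial network (GAN) and decomposes an X-ray image into a projection of bone-segmented QCT. The proposed hierarchical learning improves the robustness and accuracy of quantitatively decomposing a small-area target. An evaluation of 200 patients with osteoarthritis using the proposed method, which we named BMD-GAN, showed a Pearson correlation coefficient of 0.888 between the predicted BMD and the ground-truth DXA-measured BMD. Besides not requiring a large-scale training database, another advantage of our method is its extensibility to other anatomical areas, such as the vertebrae and ribs.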